Smart Paper Notes

Masader Plus: A New Interface for Exploring +500 Arabic NLP Datasets

Yousef Altaher, Ali Fadel, Mazen Alotaibi, Mazen Alyazidi, Mishari Al-Mutairi, Mutlaq Aldhbuiub, Abdulrahman Mosaibah, Abdelrahman Rezk, Abdulrazzaq Alhendi, Mazen Abo Shal

Category: Natural Language Processing

2022-08-01

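Masader (Alyafeai et al., 2021) created a metadata structure for cataloguing Arabic NLP datasets. However, developing an easy way to explore such a catalogue is a challenging task. To give users and researchers the best experience when exploring the catalogue, several design and user-experience challenges must be resolved. Furthermore, user interactions with the website may provide an easy way to improve the catalogue. In this paper, we introduce Masader Plus, a web interface for users to browse Masader. We demonstrate data exploration, filtering, and a simple API that allows users to examine datasets from the backend. Masader Plus can be explored at https://arbml.github.io/masader. A video recording demonstrating the interface can be found at https://www.youtube.com/watch?v=setDlseqchk.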

Translated by Google Translate

Deep Learning Approach for Aggressive Driving Behaviour Detection

Farid Talebloo, Emad A. Mohammed, Behrouz Far

Category: Machine Learning | Artificial Intelligence

2021-11-08

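Driving behaviour is one of the primary causes of road crashes and accidents, which can be reduced by identifying and minimizing aggressive driving behaviour. This study identifies the timestamps at which drivers in different circumstances (rush, mental conflict, reprisal) begin to drive aggressively. Discovering occasions of aggressive driving normally requires an observer (real or virtual) to examine driving behaviour; we overcome this problem by using a smartphone's GPS sensor to detect locations and classify the driver's behaviour every three minutes. To detect time-series patterns in our dataset, we use RNN (GRU, LSTM) algorithms to identify patterns over the course of driving. The algorithm is independent of road, vehicle, position, or driver characteristics. We conclude that three minutes (or more) of driving (120 seconds of GPS data) is sufficient to identify driver behaviour. The results show high accuracy and a high F1 score.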

LMFLOSS: A Hybrid Loss For Imbalanced Medical Image Classification

Abu Adnan Sadi, Labib Chowdhury, Nursrat Jahan, Mohammad Newaz Sharif Rafi, Radeya Chowdhury, Faisal Ahamed Khan, Nabeel Mohammed

Category: Computer Vision | Artificial Intelligence

2022-12-24

Automatic medical image classification is an important field where the use of AI has the potential to have a real social impact. However, many challenges still stand in the way of practically effective solutions. One of them is that most medical imaging datasets have a class imbalance problem, which causes existing AI techniques, particularly neural network-based deep-learning methods, to perform poorly. This makes the area an active research focus. In this study, we propose a novel loss function for training neural network models that mitigates this issue. Through rigorous experiments on three independently collected datasets from three different medical imaging domains, we empirically show that our proposed loss function consistently performs well, with an improvement of 2% to 10% in macro F1 over the baseline models. We hope that our work will precipitate new research toward a more generalized approach to medical image classification.
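
The abstract reports its gains in macro F1, which averages the per-class F1 scores without weighting by class frequency and is therefore more sensitive to minority classes than accuracy. The following is a minimal sketch on made-up labels; the function names and data are illustrative assumptions, and it does not implement the proposed loss.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced data: nine samples of class 0, one of class 1.
y_true = [0] * 9 + [1]
# A degenerate model that always predicts the majority class.
y_pred = [0] * 10

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

On this toy data the majority-class predictor keeps a high accuracy while its macro F1 is roughly halved, because the minority class contributes an F1 of zero.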

End-to-end AI Framework for Hyperparameter Optimization, Model Training, and Interpretable Inference for Molecules and Crystals

Hyun Park, Ruijie Zhu, E. A. Huerta, Santanu Chaudhuri, Emad Tajkhorshid, Donny Cooper

Category: Artificial Intelligence | Machine Learning

2022-12-21

We introduce an end-to-end computational framework that enables hyperparameter optimization with the DeepHyper library, accelerated training, and interpretable AI inference with a suite of state-of-the-art AI models, including CGCNN, PhysNet, SchNet, MPNN, MPNN-transformer, and TorchMD-Net. We use these AI models and the benchmark QM9, hMOF, and MD17 datasets to showcase the prediction of user-specified materials properties in modern computing environments, and to demonstrate translational applications for the modeling of small molecules, crystals and metal organic frameworks with a unified, stand-alone framework. We deployed and tested this framework in the ThetaGPU supercomputer at the Argonne Leadership Computing Facility, and the Delta supercomputer at the National Center for Supercomputing Applications to provide researchers with modern tools to conduct accelerated AI-driven discovery in leadership class computing environments.

Artificial intelligence-driven digital twin of a modern house demonstrated in virtual reality

Elias Mohammed Elfarri, Adil Rasheed, Omer San

Category: Computer Vision

2022-12-14

A digital twin is defined as a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, controlling, and improved decision-making. Unfortunately, the term remains vague and says little about its capability. Recently, the concept of capability level has been introduced to address this issue. According to this concept, a digital twin can be categorized by its capability on a scale from zero to five, referred to as standalone, descriptive, diagnostic, predictive, prescriptive, and autonomous, respectively. The current work introduces the concept in the context of the built environment and demonstrates it using a modern house as a use case. The house is equipped with an array of sensors that collect time-series data on the internal state of the house. Together with physics-based and data-driven models, these data are used to develop digital twins at different capability levels, demonstrated in virtual reality. In addition to presenting a blueprint for developing digital twins, the work also provides future research directions to enhance the technology.
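
The zero-to-five capability scale described above maps naturally onto an ordered enumeration. A small sketch, assuming only the level names and ordering given in the abstract; the class and function names are hypothetical.

```python
from enum import IntEnum


class CapabilityLevel(IntEnum):
    """Digital twin capability levels, ordered from 0 to 5."""
    STANDALONE = 0
    DESCRIPTIVE = 1
    DIAGNOSTIC = 2
    PREDICTIVE = 3
    PRESCRIPTIVE = 4
    AUTONOMOUS = 5


def describe(level):
    """Return a human-readable label for a numeric capability level."""
    lvl = CapabilityLevel(level)  # raises ValueError outside the 0-5 scale
    return f"Level {lvl.value}: {lvl.name.lower()}"


for n in range(6):
    print(describe(n))
```

Using `IntEnum` keeps the levels comparable, so a check such as `CapabilityLevel.AUTONOMOUS > CapabilityLevel.PREDICTIVE` reads directly as "more capable than".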

On Robust Learning from Noisy Labels: A Permutation Layer Approach

Salman Alsubaihi, Mohammed Alkhrashi, Raied Aljadaany, Fahad Albalawi, Bernard Ghanem

Category: Machine Learning | Computer Vision

2022-11-29

The existence of label noise imposes significant challenges (e.g., poor generalization) on the training process of deep neural networks (DNN). As a remedy, this paper introduces a permutation layer learning approach termed PermLL to dynamically calibrate the training process of the DNN subject to instance-dependent and instance-independent label noise. The proposed method augments the architecture of a conventional DNN by an instance-dependent permutation layer. This layer is essentially a convex combination of permutation matrices that is dynamically calibrated for each sample. The primary objective of the permutation layer is to correct the loss of noisy samples mitigating the effect of label noise. We provide two variants of PermLL in this paper: one applies the permutation layer to the model's prediction, while the other applies it directly to the given noisy label. In addition, we provide a theoretical comparison between the two variants and show that previous methods can be seen as one of the variants. Finally, we validate PermLL experimentally and show that it achieves state-of-the-art performance on both real and synthetic datasets.
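
To make the phrase "convex combination of permutation matrices" concrete, here is a minimal sketch of the variant that applies such a layer to a model's predicted class probabilities. The mixing weights are hand-picked for illustration; in the paper they are calibrated per sample during training, and how that is done is not specified in the abstract. All names below are hypothetical.

```python
def permutation_matrix(perm):
    """Build a permutation matrix: row i has a 1 in column perm[i]."""
    n = len(perm)
    return [[1.0 if perm[i] == j else 0.0 for j in range(n)] for i in range(n)]


def convex_combination(matrices, weights):
    """Mix matrices with non-negative weights that sum to one."""
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-9
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]


def apply_layer(layer, probs):
    """Matrix-vector product of the layer with a probability vector."""
    return [sum(row[j] * probs[j] for j in range(len(probs))) for row in layer]


identity = permutation_matrix((0, 1, 2))
swap_01 = permutation_matrix((1, 0, 2))  # exchanges classes 0 and 1

# Assumed weights: 80% "keep the prediction", 20% "classes 0 and 1 are confused".
layer = convex_combination([identity, swap_01], [0.8, 0.2])

probs = [0.7, 0.2, 0.1]  # hypothetical model output for one sample
adjusted = apply_layer(layer, probs)
print(adjusted)
```

Because every permutation matrix is doubly stochastic, so is any convex combination of them, which means the adjusted vector remains a valid probability distribution.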

Generalisability of deep learning models in low-resource imaging settings: A fetal ultrasound study in 5 African countries

Carla Sendra-Balcells, Víctor M. Campello, Jordina Torrents-Barrena, Yahya Ali Ahmed, Mustafa Elattar, Benard Ohene Botwe, Pempho Nyangulu, William Stones, Mohammed Ammar, Lamya Nawal Benamer

Category: Computer Vision

2022-09-20

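Most artificial intelligence (AI) research has concentrated on high-income countries, where imaging data, IT infrastructure, and clinical expertise are plentiful. However, progress has been slower in limited-resource environments where medical imaging is needed. For example, in Sub-Saharan Africa the rate of perinatal mortality is very high due to limited access to antenatal screening. In these countries, AI models could be deployed to help clinicians acquire fetal ultrasound planes for diagnosing fetal abnormalities. Deep learning models have so far been proposed to identify standard fetal planes, but there is no evidence of their ability to generalise to centres with limited access to high-end ultrasound equipment and data. This work investigates different strategies to reduce the domain-shift effect for a fetal plane classification model trained at a high-resource clinical centre and transferred to a new low-resource centre. To that end, a classifier is first evaluated at a new centre in Denmark with 1,008 patients and is later optimised to reach the same performance in five African centres (Egypt, Algeria, Uganda, Ghana, and Malawi) with 25 patients each. The results show that a transfer learning approach can be a solution for integrating small African samples with existing large-scale databases in developed countries. In particular, the model can be optimised to increase the recall to $0.92 \pm 0.04$ while maintaining a high precision. The framework shows promise for building new AI models that generalise across clinical centres with limited data acquired in challenging and heterogeneous conditions, and calls for further research into new solutions for the usability of AI in countries with fewer resources.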

Talking Head from Speech Audio using a Pre-trained Image Generator

Mohammed M. Alghamdi, He Wang, Andrew J. Bulpitt, David C. Hogg

Category: Computer Vision

2022-09-09

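We propose a novel method for generating high-resolution talking-head videos from speech audio and a single "identity" image. Our method is based on a convolutional neural network model that incorporates a pre-trained StyleGAN generator. We model each frame as a point in the latent space of StyleGAN, so that a video corresponds to a trajectory through the latent space. The network is trained in two stages. The first stage models trajectories in the latent space conditioned on speech utterances. To do this, we use an existing encoder to invert the generator, mapping each video frame into the latent space. We train a recurrent neural network to map from speech utterances to displacements in the latent space of the image generator. These displacements are relative to the back-projection into the latent space of an identity image chosen from the individuals depicted in the training dataset. In the second stage, we improve the visual quality of the generated videos by tuning the image generator on a single image or a short video of any chosen identity. We evaluate our model on standard measures (PSNR, SSIM, FID, and LMD) and show that it significantly outperforms recent state-of-the-art methods on one of two commonly used datasets and gives comparable performance on the other. Finally, we report ablation experiments that validate the components of the model. The code and videos from the experiments can be found at https://mohammedalghamdi.github.io/talking-heads-acm-mm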
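
The idea of a video as a trajectory of displacements around an identity latent can be sketched in a few lines. This is only an illustration of the bookkeeping, assuming a tiny 4-dimensional latent and hand-written displacements; in the paper the displacements are predicted from speech by a recurrent network and each latent is decoded into a frame by the image generator, neither of which is modelled here. All names are hypothetical.

```python
def latent_trajectory(identity_latent, displacements):
    """Return one latent per frame: identity latent plus that frame's displacement."""
    return [
        [w + d for w, d in zip(identity_latent, disp)]
        for disp in displacements
    ]


# Hypothetical 4-dimensional latent for the identity image (real latents are much larger).
identity_latent = [0.5, -1.0, 0.0, 2.0]

# Hypothetical per-frame displacements, standing in for the network's predictions.
displacements = [
    [0.0, 0.0, 0.0, 0.0],   # frame 0: the identity image itself
    [0.1, 0.0, -0.2, 0.0],  # frame 1
    [0.2, 0.1, -0.4, 0.0],  # frame 2
]

trajectory = latent_trajectory(identity_latent, displacements)
for frame_index, latent in enumerate(trajectory):
    print(frame_index, latent)
```

Expressing frames as offsets from the identity latent means the same displacement sequence can, in principle, be replayed around a different identity image.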

Visual Grounding of Inter-lingual Word-Embeddings

Wafaa Mohammed, Hassan Shahmohammadi, Hendrik P. A. Lensch, R. Harald Baayen

Category: Natural Language Processing

2022-09-08

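Visual grounding of language aims to enrich language representations with multiple sources of visual knowledge, such as images and videos. Although visual grounding is an area of intense research, its inter-lingual aspects have not received much attention. The present study investigates the inter-lingual visual grounding of word embeddings. We propose an implicit alignment technique between the two spaces of vision and language, in which textual information from different languages interacts to enrich pre-trained textual word embeddings. We focus on three languages in our experiments: English, Arabic, and German. We obtain visually grounded vector representations for these languages and study whether visual grounding on one or multiple languages improves the performance of the embeddings on word similarity and categorization benchmarks. Our experiments suggest that inter-lingual knowledge improves the performance of grounded embeddings for similar languages such as German and English. However, inter-lingual grounding of German or English with Arabic leads to a slight degradation in performance on word similarity benchmarks. On the other hand, we observe the opposite trend on categorization benchmarks, where Arabic yields the largest improvement for English. Several reasons for these findings are laid out in the discussion section. We hope that our experiments provide a baseline for further research on inter-lingual visual grounding.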

Energy-Efficient Trajectory Design of a Multi-IRS Assisted Portable Access Point

Nithin Babu, Marco Virgili, Mohammad Al-jarrah, Xiaoye Jing, Emad Alsusa, Petar Popovski, Andrew Forsyth, Christos Masouros, Constantinos B. Papadias

Category: Robotics

2022-09-01

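In this work, we propose a framework for an unmanned aerial vehicle (UAV)-based portable access point (PAP) deployed to serve a set of ground nodes (GNs). In addition to the PAP and GNs, the system includes a set of intelligent reflecting surfaces (IRSs) mounted on man-made structures to increase the number of bits transmitted per joule of energy consumed, measured as the global energy efficiency (GEE). The GEE trajectory of the PAP is designed by considering the UAV propulsion energy consumption and the Peukert effect of the PAP battery, which gives an accurate battery discharge profile as a non-linear function of the UAV power consumption profile. The GEE trajectory design problem is solved in two phases. In the first phase, the PAP path and feasible positions are found using a multi-tier circle packing method, and the required IRS phase shift values are calculated using an alternating optimization method that accounts for the interdependence between the amplitude and phase responses of an IRS element. In the second phase, the PAP flying velocity and user scheduling are calculated using a novel multi-lap trajectory design algorithm. Numerical evaluations show that neglecting the Peukert effect overestimates the available flight time of the PAP; that beyond a certain threshold, increasing the battery size reduces the available flight time of the PAP; that the presence of IRS modules improves the GEE of the system compared to other baseline scenarios; and that the multi-lap trajectory saves more energy than a single-lap trajectory developed using a combination of sequential convex programming and the Dinkelbach algorithm.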
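
The claim that neglecting the Peukert effect overestimates flight time can be illustrated with the textbook form of Peukert's law, t = H * (C / (I * H))^k, where C is the rated capacity at an H-hour discharge, I is the actual current, and k is the Peukert constant. This is a sketch under assumed values; the battery parameters are made up, and the paper's battery model may differ in detail.

```python
def runtime_hours(capacity_ah, rated_hours, current_a, peukert_k=1.0):
    """Peukert's law: t = H * (C / (I * H)) ** k.

    With k = 1 this reduces to the ideal estimate t = C / I.
    """
    return rated_hours * (capacity_ah / (current_a * rated_hours)) ** peukert_k


# Hypothetical battery: 10 Ah rated at a 1-hour discharge, drawn at 20 A.
CAPACITY_AH = 10.0
RATED_HOURS = 1.0
CURRENT_A = 20.0

ideal = runtime_hours(CAPACITY_AH, RATED_HOURS, CURRENT_A)            # ignores the effect
realistic = runtime_hours(CAPACITY_AH, RATED_HOURS, CURRENT_A, 1.05)  # assumed k = 1.05

print(f"ideal: {ideal:.3f} h, with Peukert effect: {realistic:.3f} h")
```

The ideal estimate is the larger of the two whenever the current exceeds the rated discharge current C / H and k is greater than one, which is the regime a propulsion-heavy UAV would typically operate in.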
